Summary of implementations of PipelineStep

Name Comments
MultiStep A step which consists of a number of steps, each given the same inputs
ExcelOutputStep Intended to be used by other steps to create, populate, and output an Excel document dynamically. This class has methods to create a workbook, which can then be manipulated from a previous step.
SetAttributesStep Selects the next step from a list of options. Configured with a map of next steps, where the key for each step is used to select it. Provide either a column number or an attribute name from which to read the value that selects the next step.
FixedWidthInput Parses fixed width text files into lines of values which can be fed to row processing steps
CsvOutput Converts incoming rows into a CSV stream sent to the pipeline output
CsvInput Parses CSV files into lines of discrete values which are fed to row processing steps
TransactionStep Wraps processing in a transaction and commits it in the finished phase

Use alwaysRollback for testing, so the transaction is always rolled back

Use the isolated property so that the transaction is local to each invocation of this step, i.e. if the step is called for each of 10 rows, the transaction will be started and committed 10 times.
This is useful for sequences of transaction steps, otherwise you will get a "nested transactions not supported" error.
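The effect of the isolated property can be modelled with a small JavaScript sketch. This is not the Kademi implementation: FakeTransactionManager and runTransactionStep are hypothetical names, and the sketch only assumes the behaviour described above (one transaction per invocation when isolated, otherwise a single transaction committed in the finished phase).

```javascript
// Hypothetical model of the behaviour described above; not Kademi's actual API.
class FakeTransactionManager {
  constructor() {
    this.active = false;
    this.begun = 0;
    this.committed = 0;
  }
  begin() {
    // Assumption: starting a transaction while one is open is an error.
    if (this.active) {
      throw new Error("nested transactions not supported");
    }
    this.active = true;
    this.begun++;
  }
  commit() {
    this.active = false;
    this.committed++;
  }
}

// Runs processRow for each row, either in one shared transaction or one per row.
function runTransactionStep(tm, rows, processRow, isolated) {
  if (!isolated) {
    tm.begin();
  }
  for (const row of rows) {
    if (isolated) {
      tm.begin();
    }
    processRow(row);
    if (isolated) {
      tm.commit();
    }
  }
  if (!isolated) {
    tm.commit(); // stands in for the commit in the finished phase
  }
}

const rows = Array.from({ length: 10 }, (_, i) => [i]);

const isolatedTm = new FakeTransactionManager();
runTransactionStep(isolatedTm, rows, () => {}, true);

const sharedTm = new FakeTransactionManager();
runTransactionStep(sharedTm, rows, () => {}, false);

console.log(isolatedTm.committed, sharedTm.committed); // prints "10 1"
```

In this model, a second non-isolated step that tried to begin while the first transaction was still open would hit the "nested transactions not supported" error.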
WebServiceStep
JsRowStep Executes JavaScript for each row. The script is expected to transform or aggregate incoming data, probably holding stateful information between calls to writeRow. Output is generated from the script by calling writeRow(..). Script context variables:
  • pipeline - a Pipeline object
  • thisStep - also a reference to the current Pipeline
  • formatter - the Formatter
  • nextStep - a NextStep object, representing the next step to be executed after the current
  • log - the logger for generating console log output
  • applications - an Applications object, use it to reference other apps and their services, eg applications.search.searchManager.search(...)
  • fileManager - a FileManager object
Here's a fairly common example:
 <RecordExecutionStep>
   <preventDuplicates>false</preventDuplicates>
   <execIdTemplate>StarClass-Dealer-Targets_2015</execIdTemplate>
   <next class="TransactionStep" alwaysRollback="false">
     <next class="ExcelInputStep">
       <useXml>true</useXml>
       <processRows>true</processRows>
       <nextSheetSteps>
         <NextSheetStep sheetNum="0" startRow="2">
           <next class="JsRowStep">
             <!-- Execute the next step for each month, setting the month number into attribute monthNum -->
             <jsPath>/integration/integration.js</jsPath>
             <execFn>foreachMonth</execFn>
             <next class="SalesDataInserter" mode="updateOrInsert" logInserts="true" logUpdates="true">
               <seriesName>star-class-partspurchase-data</seriesName>
               <column field="attributedTo" columnName="D">
                 <expr>
                   org = pipeline.thisOrg.findChildOrgByField("StoreCode", value);
                   if (org != null) {
                     return org.orgId;
                   }
                   return null;
                 </expr>
               </column>
               <column field="amount">
                 <!-- column G + 10*month -->
                 <expr><![CDATA[
                   col = 2 + formatter.toInteger(pipeline.attributes.monthNum);
                   if (0 >= row[col]) {
                     return 1;
                   } else {
                     return row[col];
                   }
                 ]]></expr>
               </column>
               <column field="fromDate">
                 <expr>formatter.monthStart(formatter.now, pipeline.attributes.monthNum)</expr>
               </column>
               <column field="toDate">
                 <expr>formatter.monthEnd(formatter.now, pipeline.attributes.monthNum)</expr>
               </column>
             </next>
           </next>
         </NextSheetStep>
       </nextSheetSteps>
     </next>
   </next>
 </RecordExecutionStep>
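The configuration above names an execFn called foreachMonth in /integration/integration.js, but the script itself is not shown. Below is a minimal sketch of what such a function might look like. It assumes that the function receives the current row, that pipeline.attributes can be assigned like a plain object, that writeRow passes a row on to the next step, and that months are numbered 1 to 12. None of these are confirmed here. The stub objects at the top stand in for the script context variables so the sketch runs on its own.

```javascript
// Stand-ins for the script context; in a real JsRowStep these would be provided.
// Their shapes are assumptions made for illustration only.
const pipeline = { attributes: {} };
const written = [];
function writeRow(row) {
  // Record which month was active when the row was passed on.
  written.push({ monthNum: pipeline.attributes.monthNum, row: row });
}

// Hypothetical implementation of the execFn named in the configuration:
// passes the same row on once per month, setting monthNum before each call.
function foreachMonth(row) {
  for (let month = 1; month <= 12; month++) {
    pipeline.attributes.monthNum = month;
    writeRow(row);
  }
}

// Made-up sample row.
foreachMonth(["a", "b", "c", "store-001"]);
console.log(written.length); // prints 12
```

Each downstream column expression can then read pipeline.attributes.monthNum, as the amount, fromDate and toDate columns do in the configuration.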
TemplateOutput Executes a Velocity template, writing the result to the output. This can be used for generating a result which returns warnings etc.
ResultEmailPipelineStep
ExcelInputStep Parses the incoming stream as an Excel workbook (xlsx). This step can be configured to pass on the entire workbook or just a single sheet, or it can process rows, optionally skipping headers. You can specify a next step for each sheet (use either sheetNum or sheetName):
 <ExcelInputStep>
   <skipHeaderRow>false</skipHeaderRow>
   <processRows>false</processRows>
   <nextSheetSteps>
     <ExcelInputStep sheetNum="0">
       <next class="JsRowStep"/>
     </ExcelInputStep>
   </nextSheetSteps>
   <useXml>false</useXml>
 </ExcelInputStep>
Or a single next step, which will be given the entire workbook as an org.apache.poi.ss.usermodel.Workbook:
 <ExcelInputStep>
   <skipHeaderRow>false</skipHeaderRow>
   <processRows>false</processRows>
   <next class="JsRowStep"/>
 </ExcelInputStep>
DatabaseSourceStep Executes a query against the Kademi database and feeds each resulting object to the next step. Select the appropriate query source with providerId.
DecisionStep Selects the next step from a list of options. Configured with a map of next steps, where the key for each step is used to select it. Provide either a column number or an attribute name from which to read the value that selects the next step.
DatabaseUpdateStep Intended to receive rows. Updates, deletes or inserts into the selected provider based on the mode property, which can be update, insert, delete or updateOrInsert. Update, delete and updateOrInsert operations require an identifier. This step works with a provider that provides access to a particular table. Available providers are:
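The selection logic described for DecisionStep can be sketched in JavaScript as follows. This is an illustration only, not the Kademi implementation: selectNextStep, the options object and the sample step keys are hypothetical, and throwing when no key matches is an assumption, since the behaviour for unmatched keys is not documented here.

```javascript
// Hypothetical model of the selection logic; names are illustrative, not Kademi's API.
function selectNextStep(nextSteps, row, attributes, options) {
  // Read the selector value from a column number or from a named attribute.
  const key = options.column !== undefined
    ? row[options.column]
    : attributes[options.attributeName];
  const step = nextSteps.get(String(key));
  if (step === undefined) {
    // Assumption: an unmatched key is treated as an error.
    throw new Error("No next step configured for key: " + key);
  }
  return step;
}

// Map of next steps, keyed by the value used to select each one.
const nextSteps = new Map([
  ["insert", (row) => "inserting " + row[1]],
  ["delete", (row) => "deleting " + row[1]],
]);

// Select by column number: column 0 of the row holds the key.
const byColumn = selectNextStep(nextSteps, ["delete", "row-42"], {}, { column: 0 });
console.log(byColumn(["delete", "row-42"])); // prints "deleting row-42"

// Select by attribute name: the key comes from a pipeline attribute.
const byAttribute = selectNextStep(
  nextSteps, ["x", "row-7"], { action: "insert" }, { attributeName: "action" });
console.log(byAttribute(["x", "row-7"])); // prints "inserting row-7"
```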
MapStep Adds rows to a map keyed on a value generated by an expression. The map is set into a pipeline attribute. Rows are also passed through unchanged to the next step. Example:
 <MultiStep>
   <nextSteps>
     <DatabaseSourceStep providerId="survey">
       <next class="co.kademi.server.integration.impl.MapStep">
         <keyColumn column="0">
           <expr>value.profile.userId</expr>
         </keyColumn>
         <valueColumn column="0"/>
         <keyAttribute>r1</keyAttribute>
       </next>
       <column field="rewardNames">
         <value class="string">r17744</value>
       </column>
     </DatabaseSourceStep>
     <DatabaseSourceStep providerId="survey">
       <next class="ExcelOutputStep">
         <firstSheetTitle>My first sheet</firstSheetTitle>
         <headers>
           <string>ID</string>
           <string>Reward</string>
           <string>First Name</string>
           <string>Last Name</string>
           <string>r1 Pets name</string>
           <string>r2 Pets name</string>
         </headers>
         <column field="id" column="0">
           <expr>value.id</expr>
         </column>
         <column field="id" column="0">
           <expr>value.reward.name</expr>
         </column>
         <column field="firstName" column="0">
           <expr>value.profile.?firstName</expr>
         </column>
         <column field="surName" column="0">
           <expr>value.profile.?surName</expr>
         </column>
         <column field="ReccomendBasedOnRego" column="0">
           <expr>value.answers.?answer_petsName</expr>
         </column>
         <column field="ReccomendBasedOnRego" column="0">
           <expr>pipeline.attributes.r1[value.profile.userId].answers.?answer_petsName</expr>
         </column>
       </next>
       <column field="rewardNames">
         <value class="string">r25562</value>
       </column>
     </DatabaseSourceStep>
   </nextSteps>
 </MultiStep>
SalesDataInserter Creates a new sales data record for each call. Requires the following arguments:
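What MapStep does with each row can be modelled roughly as below. This is a sketch under assumptions, not the actual implementation: mapStep and its parameters are hypothetical names, the map is modelled as a plain object so that a lookup like pipeline.attributes.r1[userId] works, and the survey rows are made-up sample data.

```javascript
// Hypothetical model of MapStep's behaviour; not the Kademi implementation.
function mapStep(rows, keyExpr, valueExpr, attributes, keyAttribute, next) {
  const map = {};
  attributes[keyAttribute] = map; // expose the map as a pipeline attribute
  for (const row of rows) {
    map[keyExpr(row)] = valueExpr(row); // key generated by an expression
    next(row); // rows are also passed through unchanged
  }
}

const attributes = {};
const passedOn = [];

// Made-up sample data: each row holds one survey result object in column 0.
const surveyRows = [
  [{ profile: { userId: 7 }, answers: { answer_petsName: "Rex" } }],
  [{ profile: { userId: 9 }, answers: { answer_petsName: "Tom" } }],
];

mapStep(
  surveyRows,
  (row) => row[0].profile.userId, // plays the role of the keyColumn expression
  (row) => row[0],                // plays the role of valueColumn column="0"
  attributes,
  "r1",                           // plays the role of keyAttribute
  (row) => passedOn.push(row)
);

// A later step can look up the stored value by key.
console.log(attributes.r1[7].answers.answer_petsName); // prints "Rex"
```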
  • Amount
  • Attributed to - orgId for an organisation, or email or userid for a profile
  • From - date/time for the beginning of the period
  • To - date/time for the end of the period. Same as from if the data is for a single point in time
And any extra fields defined on the data series. The columns list maps fields onto column numbers. The standard fields are: amount, attributedTo, fromDate, toDate
RunPointsAllocationSourcesPipelineStep
RecordExecutionStep Wraps processing in a transaction and commits it in the finished phase. This allows executions to be tagged with an execution ID, which is intended to uniquely identify an import or export of this pipeline. This can be used to prevent multiple processings of the same data. For example, you might have a file containing points being loaded once each day. You might use an execution ID template which evaluates to (for example) "points-31012015". Then if another execution generated the same ID you would conclude this is a double-import and throw an error.
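The duplicate check described above can be sketched as follows. This is an illustration only: executionId and ExecutionLog are hypothetical names, the ddMMyyyy date layout is inferred from the "points-31012015" example, and a real step would persist recorded IDs rather than hold them in memory.

```javascript
// Builds an ID such as "points-31012015" from a prefix and a date (ddMMyyyy).
// The date layout is an assumption inferred from the example ID.
function executionId(prefix, date) {
  const dd = String(date.getDate()).padStart(2, "0");
  const mm = String(date.getMonth() + 1).padStart(2, "0");
  return prefix + "-" + dd + mm + date.getFullYear();
}

// Hypothetical in-memory record of executions already seen.
class ExecutionLog {
  constructor() {
    this.seen = new Set();
  }
  record(id) {
    if (this.seen.has(id)) {
      throw new Error("Duplicate execution: " + id);
    }
    this.seen.add(id);
  }
}

const log = new ExecutionLog();
const id = executionId("points", new Date(2015, 0, 31)); // 31 January 2015
log.record(id); // first import of the day succeeds

let duplicateMessage = null;
try {
  log.record(id); // loading the same file again is rejected
} catch (e) {
  duplicateMessage = e.message;
}
console.log(id); // prints "points-31012015"
console.log(duplicateMessage); // prints "Duplicate execution: points-31012015"
```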
VelocityOutputStep Generates HTML template and outputs it to the pipeline output